Smart Paper Notes

A cost effective eye movement tracker based wheel chair control algorithm for people with paraplegia

Skanda Upadhyaya, Shravan Bhat, Siddhanth P. Rao, V Ashwin, Krishnan Chemmangat

Categories: Artificial Intelligence | Robotics

2022-07-21

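Spinal cord injuries often result in quadriplegia, which restricts patients' mobility. A wheelchair can be a good option for such patients, but most wheelchairs are operated either manually or by an electric motor controlled with a joystick. Both require the use of the hands, which makes them unsuitable for quadriplegic patients. Control over eye movement, on the other hand, is retained even by people who have suffered brain injury. Monitoring eye movements can therefore be a useful tool for generating wheelchair control signals. This paper presents an approach to converting eye movements into meaningful signals by attempting to control a robot that imitates a wheelchair. The overall system is cost-effective and uses simple image processing and pattern recognition to control the robot. An Android application is also developed, which the patient's aide could use for more refined control of the wheelchair in a real-world setting.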

translated by Google Translate

Towards Automating Retinoscopy for Refractive Error Diagnosis

Aditya Aggarwal, Siddhartha Gairola, Uddeshya Upadhyay, Akshay P Vasishta, Diwakar Rao, Aditya Goyal, Kaushik Murali, Nipun Kwatra, Mohit Jain

Categories: Computer Vision

2022-08-10

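Refractive error is the most common eye disorder and the key cause of correctable visual impairment, accounting for nearly 80% of visual impairment in the US. Refractive error can be diagnosed with several methods, including subjective refraction, retinoscopy, and autorefractors. Although subjective refraction is the gold standard, it requires the patient's cooperation and is therefore unsuitable for infants, young children, and developmentally delayed adults. Retinoscopy is an objective refraction method that requires no input from the patient. However, it needs a lens kit and a trained examiner, which limits its use for mass screening. In this work, we automate retinoscopy by attaching a smartphone to a retinoscope and recording retinoscopic videos while the patient wears a custom paper frame. We develop a video processing pipeline that takes retinoscopic videos as input and estimates the net refractive error based on our proposed extension of the mathematical model of retinoscopy. Our system removes the need for a lens kit and can be operated by an untrained examiner. In a clinical trial on 185 eyes, we achieved a sensitivity of 91.0% and a specificity of 74.0%. Moreover, the mean absolute error of our approach was 0.75 $\pm$ 0.67 D compared with subjective refraction measurements. Our results indicate that our approach has the potential to serve as a retinoscopy-based refractive error screening tool in real-world medical settings.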
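For readers unfamiliar with the three metrics quoted above, the following minimal sketch shows how sensitivity, specificity, and mean absolute error are conventionally computed. The function names and the toy numbers in the comments are illustrative assumptions; they are not the trial's data or the authors' code.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives that the screening flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of actual negatives that the screening flags as negative."""
    return true_neg / (true_neg + false_pos)


def mean_absolute_error(predicted, reference):
    """Average absolute difference between two equal-length sequences,
    e.g. estimated vs. reference refractive error in diopters."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Toy counts and values, chosen only for illustration (hypothetical).
toy_sensitivity = sensitivity(true_pos=9, false_neg=1)
toy_specificity = specificity(true_neg=3, false_pos=1)
toy_mae = mean_absolute_error([1.0, -2.0, 0.5], [1.5, -1.0, 0.5])
```

A screening tool usually favors high sensitivity, since a missed case is costlier than a false referral; the reported 91.0% / 74.0% split is consistent with that kind of trade-off.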


Hyper-Universal Policy Approximation: Learning to Generate Actions from a Single Image using Hypernets

Dimitrios C. Gklezakos, Rishi Jha, Rajesh P. N. Rao

Categories: Machine Learning

2022-07-07

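Inspired by Gibson's notion of object affordances in human vision, we ask: how can an agent learn to predict an entire action policy for a novel object or environment from only a single glimpse? To tackle this problem, we introduce the concept of Universal Policy Functions (UPFs), which are state-to-action mappings that generalize not only to new goals but, most importantly, to novel, unseen environments. Specifically, we consider the problem of efficiently learning such policies for agents with limited computational and communication capacity, constraints that are frequently encountered in edge devices. We propose the Hyper-Universal Policy Approximator (HUPA), a hypernetwork-based model that generates small task- and environment-conditional policy networks from a single image, with good generalization properties. Our results show that HUPAs significantly outperform an embedding-based alternative for generating size-constrained policies. Although this work is restricted to a simple map-based navigation task, future work includes applying the principles behind HUPAs to learning more general affordances for objects and environments.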
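The core idea of a hypernetwork, one network emitting the weights of another, can be sketched in a few lines. This is a minimal, untrained illustration under stated assumptions: the class name `HyperNetwork`, the single linear hypernetwork layer, the tanh hidden layer, and the use of a plain list as a stand-in for an image embedding are all hypothetical choices made here, not the HUPA architecture from the paper.

```python
import math
import random


def linear(weights, bias, x):
    """Compute y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


class HyperNetwork:
    """Maps a context vector to the parameters of a small policy network."""

    def __init__(self, context_dim, state_dim, hidden_dim, n_actions, seed=0):
        rng = random.Random(seed)
        self.state_dim = state_dim
        self.hidden_dim = hidden_dim
        self.n_actions = n_actions
        # Total parameter count of the generated two-layer policy network.
        self.n_params = (hidden_dim * state_dim + hidden_dim
                         + n_actions * hidden_dim + n_actions)
        # One linear layer from context to the flat policy parameter vector
        # (random and untrained here; assumption for illustration only).
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(context_dim)]
                  for _ in range(self.n_params)]
        self.b = [0.0] * self.n_params

    def generate_policy(self, context):
        """Return a state -> action function whose weights depend on context."""
        flat = iter(linear(self.W, self.b, context))

        def take(n):
            return [next(flat) for _ in range(n)]

        W1 = [take(self.state_dim) for _ in range(self.hidden_dim)]
        b1 = take(self.hidden_dim)
        W2 = [take(self.hidden_dim) for _ in range(self.n_actions)]
        b2 = take(self.n_actions)

        def policy(state):
            hidden = [math.tanh(h) for h in linear(W1, b1, state)]
            logits = linear(W2, b2, hidden)
            # Greedy action: index of the largest logit.
            return max(range(len(logits)), key=logits.__getitem__)

        return policy


hyper = HyperNetwork(context_dim=8, state_dim=2, hidden_dim=4, n_actions=4)
policy = hyper.generate_policy([0.5] * 8)  # stand-in for an image embedding
action = policy([0.1, -0.3])               # an integer in 0..3
```

The design point the sketch illustrates is that the generated policy is tiny (32 parameters here) and can be evaluated on its own, which is what makes the approach attractive for agents with limited compute and communication budgets.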


Recursive Neural Programs: Variational Learning of Image Grammars and Part-Whole Hierarchies

Ares Fisher, Rajesh P. N. Rao

Categories: Computer Vision | Machine Learning

2022-06-16

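Human vision involves parsing and representing objects and scenes using structured representations based on part-whole hierarchies. Computer vision and machine learning researchers have recently sought to emulate this capability using capsule networks, reference frames, and active predictive coding, but a generative model formulation has been lacking. We introduce Recursive Neural Programs (RNPs), which are, to our knowledge, the first neural generative model to address the part-whole hierarchy learning problem. RNPs model images as hierarchical trees of probabilistic sensory-motor programs that recursively reuse learned sensory-motor primitives to model an image within different reference frames, forming recursive image grammars. We express RNPs as structured variational autoencoders (sVAEs) for inference and sampling, and demonstrate parts-based parsing, sampling, and one-shot transfer learning on the MNIST, Omniglot, and Fashion-MNIST datasets, showing the model's expressive power. Our results show that RNPs provide an intuitive and explainable way of composing objects and scenes, allowing rich compositionality and intuitive interpretations of objects in terms of part-whole hierarchies.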


Emergent behavior and neural dynamics in artificial agents tracking turbulent plumes

Satpreet Harcharan Singh, Floris van Breugel, Rajesh P. N. Rao, Bingni Wen Brunton

Categories: Artificial Intelligence | Machine Learning | Neural and Evolutionary Computing

2021-09-25

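Tracking a turbulent plume to locate its source is a complex control problem because it requires multi-sensory integration and must be robust to intermittent odors, changing wind direction, and variable plume statistics. Flying insects perform this task routinely, often over long distances, in pursuit of food or mates. Several aspects of this remarkable behavior have been examined in detail in many experimental studies. Here, we take a complementary in silico approach, using artificial agents trained with reinforcement learning to develop an integrated understanding of the behaviors and neural computations that support plume tracking. Specifically, we use deep reinforcement learning (DRL) to train recurrent neural network (RNN) agents to locate the source of simulated turbulent plumes. Interestingly, the agents' emergent behaviors resemble those of flying insects, and the RNNs learn to represent task-relevant variables such as head direction and the time since the last odor encounter. Our analyses suggest an intriguing, experimentally testable hypothesis for tracking plumes in changing wind direction: agents follow the local plume shape rather than the current wind direction. While reflexive short-memory behaviors are sufficient for tracking plumes in constant wind, longer memory timescales are essential for tracking plumes that switch direction. At the level of neural dynamics, the RNNs' population activity is low-dimensional and organized into distinct dynamical structures, with some correspondence to behavioral modules. Our in silico approach provides key intuitions about turbulent plume tracking strategies and motivates future targeted experimental and theoretical developments.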


Multilingual Audio-Visual Smartphone Dataset And Evaluation

Hareesh Mandalapu, Aravinda Reddy P N, Raghavendra Ramachandra, K Sreenivasa Rao, Pabitra Mitra, S R Mahadeva Prasanna, Christoph Busch

Categories: Computer Vision

2021-09-09

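Smartphones already use biometric verification systems to provide security in highly sensitive applications. Audio-visual biometrics are becoming popular because of their usability, and their multimodal nature makes them challenging to spoof. In this work, we present an audio-visual smartphone dataset captured on five different recent smartphones. The new dataset contains 103 subjects captured in three different sessions that reflect different real-world scenarios. Three different languages are recorded in the dataset to cover the problem of language dependency in speaker recognition systems. These unique characteristics pave the way for implementing novel state-of-the-art unimodal or audio-visual speaker recognition systems. We also report the performance of benchmark biometric verification systems on the dataset. The robustness of the biometric algorithms is evaluated through extensive experiments against multiple dependencies such as signal noise, device, and language, and against presentation attacks such as replayed and synthesized signals. The results raise many concerns about the generalization properties of state-of-the-art biometric methods on smartphones.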


Protein-Ligand Complex Generator & Drug Screening via Tiered Tensor Transform

Jonathan P. Mailoa, Zhaofeng Ye, Jiezhong Qiu, Chang-Yu Hsieh, Shengyu Zhang

Categories: Neural and Evolutionary Computing

2023-01-03

Accurate determination of a small molecule candidate (ligand) binding pose in its target protein pocket is important for computer-aided drug discovery. Typical rigid-body docking methods ignore the pocket flexibility of protein, while the more accurate pose generation using molecular dynamics is hindered by slow protein dynamics. We develop a tiered tensor transform (3T) algorithm to rapidly generate diverse protein-ligand complex conformations for both pose and affinity estimation in drug screening, requiring neither machine learning training nor lengthy dynamics computation, while maintaining both coarse-grain-like coordinated protein dynamics and atomistic-level details of the complex pocket. The 3T conformation structures we generate are closer to experimental co-crystal structures than those generated by docking software, and more importantly achieve significantly higher accuracy in active ligand classification than traditional ensemble docking using hundreds of experimental protein conformations. 3T structure transformation is decoupled from the system physics, making future usage in other computational scientific domains possible.


Posterior Collapse and Latent Variable Non-identifiability

Yixin Wang, David M. Blei, John P. Cunningham

Categories: Machine Learning (Statistics) | Machine Learning

2023-01-02

Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful representations. Existing approaches to posterior collapse often attribute it to the use of neural networks or optimization issues due to variational approximation. In this paper, we consider posterior collapse as a problem of latent variable non-identifiability. We prove that the posterior collapses if and only if the latent variables are non-identifiable in the generative model. This fact implies that posterior collapse is not a phenomenon specific to the use of flexible distributions or approximate inference. Rather, it can occur in classical probabilistic models even with exact inference, which we also demonstrate. Based on these results, we propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. This model class resolves the problem of latent variable non-identifiability by leveraging bijective Brenier maps and parameterizing them with input convex neural networks, without special variational inference objectives or optimization tricks. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
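The two conditions linked in the abstract can be written out explicitly. The notation below (prior $p(z)$, likelihood $p_\theta(x \mid z)$) and the exact form of the non-identifiability condition are assumptions made here for illustration; the paper's own definitions may be stated differently.

```latex
% Posterior collapse: the posterior carries no information about x.
p_\theta(z \mid x) = p(z) \quad \text{for all } x

% Latent variable non-identifiability (assumed form): the likelihood
% does not depend on the value of z.
p_\theta(x \mid z = z_1) = p_\theta(x \mid z = z_2) \quad \text{for all } z_1, z_2

% One direction follows from Bayes' rule: if p_\theta(x \mid z) = g(x)
% for every z, then
p_\theta(z \mid x)
  = \frac{p_\theta(x \mid z)\, p(z)}{\int p_\theta(x \mid z')\, p(z')\, dz'}
  = \frac{g(x)\, p(z)}{g(x) \int p(z')\, dz'}
  = p(z)
```

Note that no approximation appears in this argument, which is consistent with the abstract's claim that collapse can occur even under exact inference.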


A Functional approach for Two Way Dimension Reduction in Time Series

Aniruddha Rajendra Rao, Haiyan Wang, Chetan Gupta

Categories: Machine Learning

2023-01-01

The rise in data has led to the need for dimension reduction techniques, especially in the area of non-scalar variables, including time series, natural language processing, and computer vision. In this paper, we specifically investigate dimension reduction for time series through functional data analysis. Current methods for dimension reduction in functional data are functional principal component analysis and functional autoencoders, which are limited to linear mappings or scalar representations of the time series and are therefore inefficient. In real data applications, the nature of the data is much more complex. We propose a non-linear function-on-function approach consisting of a functional encoder and a functional decoder. It uses continuous hidden layers made up of continuous neurons to learn the structure inherent in functional data, which addresses the aforementioned limitations of existing approaches. Our approach gives a low-dimensional latent representation by reducing both the number of functional features and the number of timepoints at which the functions are observed. The effectiveness of the proposed model is demonstrated through multiple simulations and real data examples.


Pseudo-Inverted Bottleneck Convolution for DARTS Search Space

Arash Ahmadian, Yue Fei, Louis S. P. Liu, Konstantinos N. Plataniotis, Mahdi S. Hosseini

Categories: Machine Learning

2022-12-31

Differentiable Architecture Search (DARTS) has attracted considerable attention as a gradient-based Neural Architecture Search (NAS) method. Since the introduction of DARTS, there has been little work done on adapting the action space based on state-of-the-art architecture design principles for CNNs. In this work, we aim to address this gap by incrementally augmenting the DARTS search space with micro-design changes inspired by ConvNeXt and studying the trade-off between accuracy, evaluation layer count, and computational cost. To this end, we introduce the Pseudo-Inverted Bottleneck conv block, intended to reduce the computational footprint of the inverted bottleneck block proposed in ConvNeXt. Our proposed architecture is much less sensitive to evaluation layer count and significantly outperforms a DARTS network of similar size at layer counts as small as 2. Furthermore, with fewer layers, it not only achieves higher accuracy with lower GMACs and parameter count, but GradCAM comparisons also show that our network detects distinctive features of target objects better than DARTS.
